Abstract

Background: Interspecific hybridization creates individuals harboring diverged genomes. The interaction of 
these genomes can generate successful evolutionary novelty or disadvantageous genomic conflict. Annual 
sunflowers Helianthus annuus and H. petiolaris have a rich history of hybridization in natural populations. Although 
first-generation hybrids generally have low fertility, hybrid swarms that include later generation and fully fertile 
backcross plants have been identified, as well as at least three independently-originated stable hybrid taxa. We 
examine patterns of transcript accumulation in the earliest stages of hybridization of these species via analyses 
of transcriptome sequences from laboratory-derived F1 offspring of an inbred H. annuus cultivar and a wild 
H. petiolaris accession. 

Results: While nearly 14% of the reference transcriptome showed significant accumulation differences between 
parental accessions, total F1 transcript levels showed little evidence of dominance, as midparent transcript levels 
were highly predictive of transcript accumulation in F1 plants. Allelic bias in F1 transcript accumulation was 
detected in 20% of transcripts containing sufficient polymorphism to distinguish parental alleles; however, the 
magnitude of these biases was generally smaller than differences among parental accessions. 

Conclusions: While analyses of allelic bias suggest that cis regulatory differences between H. annuus and 
H. petiolaris are common, their effect on transcript levels may be more subtle than trans-acting regulatory 
differences. Overall, these analyses found little evidence of regulatory incompatibility or dominance interactions 
between parental genomes within F1 hybrid individuals, although it is unclear whether this is a legacy or an 
enabler of introgression between species. 
Background

For organisms that reproduce sexually, biological fitness 
requires the successful interaction of maternal and pa- 
ternal genomes within the new individual. While these 
interactions may take place at various points along the 
path from DNA to external phenotype, analyses of tran- 
script accumulation currently provide the strongest tech- 
nology to detect these interactions on a genome-wide 
scale. Changes in transcript levels are hypothesized to 
enable response to selective forces in novel environ- 
ments [1-3]. Alteration of single components of regulatory 
machinery may have dramatic effects on transcript profiles 
[4]. We therefore expect that bringing together two 
sets of regulatory machinery that have been separated 
for millions of years may lead to novel patterns of tran- 
scription that contribute to novel phenotypes in interspe- 
cific hybrids. 

In plants, the effect of inter-species hybridization on 
transcript levels has been most extensively studied in al- 
lopolyploids, where hybridization occurs in conjunction 
with genome doubling. Comparison of allopolyploids 
with autopolyploids in several systems has provided 
evidence that hybridization has more dramatic effects 
on transcript phenotypes than increased ploidy [5-7]. 
In some cases, polyploidization following hybridization 
has been proposed as a mechanism of moderating novel 
transcript phenotypes generated by regulatory divergence 
between parental genomes [8]. Extreme gene expres- 
sion changes following hybridization, or "transcriptional 
shock", have been described in early-generation allopo- 
lyploid hybrids of Arabidopsis, wheat, and cotton, as well 
as diploid Senecio spp. hybrids [9-14]. In later-generation 
hybrids and backcrosses, changes in gene expression may 
be caused by genome rearrangement, segregation of 
parental alleles, or environmentally mediated selection 
on accumulated mutation. In first-generation (F1) hybrids, 
by contrast, transcription will be controlled by interaction 
between parental genomes mediated by transcriptional 
machinery. Non-additive F1 transcriptional phenotypes 
may be caused by differences between parental species at 
the transcribed locus (cis effects) or differences in 
trans-acting regulatory factors. In hybrids, parental genomes are 
exposed to a common pool of trans-acting factors, and 
analyses of allelic bias, or differential parental genome 
contributions to accumulated transcript, can provide in- 
sight into the relative contributions of cis and trans effects 
to inter-specific gene expression differences [15,16]. 

The sunflower genus Helianthus is native to North 
America and contains 49 species of annual or perennial 
herbs. The annual sunflowers form a distinct and well- 
supported clade containing eleven species, including the 
widely-distributed species H. annuus and H. petiolaris. 
These species likely originated in allopatry, but their 
current ranges show considerable overlap. Cytological 
studies and genetic maps constructed from interspecific 
crosses suggest that chromosomal rearrangements have 
accumulated since the evolutionary separation of H. 
annuus and H. petiolaris. These species are also sepa- 
rated by differences in morphology, life history and habi- 
tat preference, and show poor pollen viability in hybrid 
offspring [17-19]. 

Although H. annuus and H. petiolaris are estimated to 
have diverged from each other nearly 2 million years ago 
(Sambatti et al. 2012), they have been observed to 
hybridize in natural settings [20,21]. Average divergence 
between H. annuus and H. petiolaris is estimated to 
range from Fst = 0.19 (based on microsatellite variation) 
to Fst = 0.3 (based on sequence polymorphism in tran- 
scripts), similar to levels of intraspecific divergence 
among stickleback populations and between human pop- 
ulations from West Africa and East Asia [22-24]. This 
relatively low divergence is consistent with analyses of 
single-gene phylogenies that suggest substantial recent 
introgression between H. annuus and H. petiolaris [22]. 
In at least three cases, hybridization between H. annuus 
and H. petiolaris has led to the formation of distinct 
hybrid species (H. anomalus, H. deserticola, and H. pa- 
radoxus), which occupy extreme habitats (active sand 
dunes, desert, and salt marshes respectively). It has been 
hypothesized, with experimental support, that hybrids 
bearing genotypes associated with phenotypic traits and 
environmental tolerances outside of the range exhibited 
by either parental species were able to colonize unusual 
ecological niches and form new species [25,26]. 

Hybrids between H. annuus and H. petiolaris have also 
been created for research and agricultural purposes. 
Most prominently, H. petiolaris is the source of cyto- 
plasmic male sterility PET1, widely used in commercial 
sunflower hybrid production [27]. H. petiolaris is a po- 
tential source of useful germplasm for improvement of 
H. annuus cultivar resistance to stresses, particularly os- 
motic stresses such as drought and saline soils. 

Here we investigate patterns of transcript accumu- 
lation in hybrid sunflowers generated from controlled 
crosses of Helianthus annuus (cmsHA89) with H. petiolaris 
(Pet2152). We find that the majority of transcripts 
accumulate to intermediate levels in the F1 hybrid, and 
moreover, that mean transcript levels across parental 
accessions are highly predictive of transcript levels 
observed in F1 hybrids. Few transcripts showed accumulation 
outside of the range observed in parental accessions. 
Within F1 individuals, bias in accumulation of parental 
alleles was detected in 20% of transcripts where parental 
alleles could be reliably distinguished, but the magnitude of 
differences in accumulation was generally lower than 
differences observed between parental accessions. These 
results suggest that both cis and trans regulatory divergence 
contribute to interspecific differences in transcription, yet 
H. annuus and H. petiolaris genomes show relatively few 
instances of "misregulation" or extreme phenotypes at the 
transcript level. 

Methods 

Plant growth and generation of H. annuus x H. petiolaris 
hybrids 

We used a cultivated accession of H. annuus, rather than 
a wild accession that might more closely represent the 
parents of homoploid hybrid sunflowers, for several 
practical reasons, described below: (1) Male sterility and 
distinct morphology of the H. annuus cultivar used, 
cmsHA89, provided better recovery and identification of 
hybrids than would be expected from crosses involving 
wild H. annuus. (2) cmsHA89 is not self-incompatible, 
but requires a pollen donor to produce viable seed. This 
reduced the chances of self fertilization due to mentor 
effects in mixed pollen loads (Desrochers et al. 1998). 
cmsHA89 sterility is conferred by the PET1 cytoplasm; 
while the ultimate origins of this cytoplasm remain 
unclear, it was introduced into H. annuus cultivated 
lines via introgression from H. petiolaris. However, the 
cmsHA89 cytotype is extremely rare in natural popula- 
tions of Helianthus (Rieseberg et al. 1994). (3) The large 
heads of H. annuus cultivars provide greater potential 
seed yield per single cross than wild accessions. (4) The 
relatively homozygous genome of cmsHA89 also provided 
greater power to identify variants between parental 
genomes and assign parentage to alleles within hybrid 
offspring. This last factor is particularly important as H. 
petiolaris is intolerant to inbreeding and inbred lines of 
H. petiolaris are not available. 

Plants of H. annuus cultivar cmsHA89 (USDA PI 
650572) and wild H. petiolaris accession Pet2152 (USDA 
PI 586920) were grown in one gallon pots under stan- 
dard greenhouse conditions at the University of British 
Columbia Botanical Garden Nursery. Before they began 
to open, cmsHA89 flowers were covered with drawstring 
organza bags to deter unauthorized pollination. When 
the anther filaments of at least the outer three rings of 
florets were exposed, cmsHA89 flowers were pollinated 
with pollen from a single Pet2152 plant and re-covered. 
Reciprocal crosses were not performed because cmsHA89 
does not produce pollen. At the same time, self-incompa- 
tible Pet2152 plants were intercrossed. Seed heads were 
allowed to mature and dry before removal from the plant. 

While crosses between H. annuus and H. petiolaris are 
generally of poor fertility, one cmsHA89 x Pet2152 cross 
produced approximately 200 mature seeds. F1 seeds 
(n = 60) were scarified (removal of the top 1/3 of the 
seed) to improve synchrony of germination and placed 
on moist filter paper disks in plastic petri dishes in the 
dark at approximately 25°C. cmsHA89 and Pet2152 seeds 
(half-siblings of the F1 seedlings) were treated similarly. 
After 3 days, seedlings were transferred to soil in 32-cell 
nursery flats in a controlled environment chamber (16:8 
light:dark, 50% RH, 28°C). After 3 additional weeks, plants 
were transplanted into 1-gallon pots and moved to a 
greenhouse bench. 

The hybrid identity of putative F1 plants was confirmed 
via examination of external phenotypes and molecular 
markers. Visible phenotypic markers included 
pigmentation at the base of the stem, leaf shape, plant 
branching, and production of foliar glandular trichomes. 
Molecular marker phenotypes were observed by ex- 
traction of genomic DNA and amplification of two loci 
previously determined to differ in size (distinguishable 
by agarose gel electrophoresis) between cmsHA89 and 
Pet2152. PCR primers, amplification conditions, and re- 
presentative gel images are provided in Additional file 1: 
Supplemental Methods 1. 

mRNA extraction and sequencing 

At 45 days post-germination, leaf tissue was collected 
from 8 F1 plants and 2 plants from each parent accession. 
The youngest fully-expanded leaf was cut from 
each plant, placed into a 50 ml conical tube, and imme- 
diately frozen in liquid nitrogen. Total RNA was 
extracted from approximately 50 mg of ground tissue as 
described [28]. Preparation of non-normalized cDNA 



libraries and whole transcriptome shotgun sequencing 
(RNA-Seq) via Illumina HiSeq 2000 were performed at the 
Michael Smith Genome Sciences Centre in Vancouver, 
British Columbia, Canada (http://www.bcgsc.ca/services). 
Samples were multiplexed with 3 samples per lane. 

Sequence data processing and analysis 

Paired-end, 100 bp RNA-Seq reads (chastity > 0.6) were 
aligned to an H. annuus-derived transcriptome reference 
[29]. This reference, assembled from 93428 EST sequen- 
ces [30,31], consists of 16312 unique contigs with a total 
length of 17.062 million bases. Fasta-formatted sequence 
for the transcript reference is available at datadryad.org 
[29]. The median insert size between paired-end reads 
ranged from 131 to 151 bases per sample. Approxi- 
mately 52% (8559) of reference contigs were assigned to 
genetic map positions within the H. annuus genome via 
identity to sequenced markers appearing on a map of 
H. annuus derived from recombinant inbred lines from 
the population RHA280 x RHA801 (Renaut et al. in re- 
view, [32]) (Figure 1). Genetic map positions assigned to 
the transcript reference are available as Additional file 2: 
Table S3. Alignments were performed using the Burrows- 
Wheeler Aligner (BWA) tools aln and sampe using a 
maximum insert size of 1000 and a quality filter of 30 to 
trim reads [33]. Aligned BAM files were sorted and PCR 
duplicates removed using the SAMtools utilities 'sort' and 
'rmdup' [34]. Reads per contig were counted for each sample 
using coverageBed [35]. 

Read counts were analyzed in R using the DESeq package 
to compare counts of reads aligned to a given reference 
contig [36]. The DESeq package uses a modified 
Fisher's exact test with data fit to a negative binomial 
distribution to test for pairwise differences in count 
data between sample classes, allowing within-transcript 
comparisons across a broad dynamic range. Three pairwise 
comparisons were performed to identify contigs 
that showed differences in accumulated mapped transcript 
reads: H. annuus cmsHA89 versus H. petiolaris 
Pet2152, cmsHA89 versus F1 samples, and Pet2152 versus 
F1 samples. 

Figure 1 Allelic bias in transcript accumulation within F1 hybrid plants. The x-axis indicates genetic position along consecutively ordered 
chromosomes of the H. annuus genome (Additional file 2: Table S3); chromosome borders are delineated in the black and white bar labeled 
"CHR". "Total reads mapped" provides the sum of sequence reads (among 12 samples) assigned to each position; "fixed cmsHA89-Pet2152 SNP" 
identifies the location of variants used to assign allelic origin (see Methods); "species differences" shows the location of contigs showing significant 
differences in transcript accumulation between H. annuus and H. petiolaris samples; "allele differences" shows the position of contigs identified as 
showing significant differences in accumulation of parental alleles in F1 hybrid samples. Bars labeled "ALL" and "SPP" show the summed direction 
of significant parental differences or allelic bias for that genetic map location; red indicates that the H. annuus samples or alleles show higher 
transcript accumulation, blue indicates that H. petiolaris samples or alleles show higher transcript accumulation. 

Per-contig read counts were averaged across parental 
accession samples (cmsHA89, Pet2152) to generate 
mean parent values, and across samples from F1 hybrid 
plants. Linear modeling of mean transcript counts from 
F1 plants as a function of mean parent transcript levels 
was performed in R. Examination of residuals and leverage 
estimates for this model led us to remove 3 contigs 
with values of Cook's D exceeding 1. Refitting the model 
without these points did not significantly alter the 
parameter estimates for the model. Predictions of hybrid 
transcript values, with 99% confidence intervals, were 
generated using this model. Reference contigs showing 
mean hybrid transcript accumulation outside the confines 
of the confidence interval for predictions were classed as 
"non-additive". 

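The midparent regression described above can be illustrated with a short sketch. It is not the authors' R code: the function names are hypothetical, the interval is taken to be a prediction interval, and a normal quantile stands in for the t quantile (a reasonable approximation only when many contigs are fitted). Contigs with Cook's D above the cutoff are left out of the fit but are still classified, which is also an assumption.

```python
from statistics import NormalDist, fmean


def ols(x, y):
    """Least-squares fit y = slope * x + intercept, plus quantities reused below."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    mse = sum(r * r for r in resid) / (n - 2)  # residual variance, 2 fitted parameters
    return slope, intercept, resid, mse, mx, sxx


def cooks_distance(x, y):
    """Cook's D for each observation of a simple linear regression."""
    n = len(x)
    _, _, resid, mse, mx, sxx = ols(x, y)
    dists = []
    for xi, ri in zip(x, resid):
        h = 1.0 / n + (xi - mx) ** 2 / sxx  # leverage of this observation
        dists.append(ri * ri / (2 * mse) * h / (1 - h) ** 2)
    return dists


def non_additive(midparent, f1, level=0.99, cook_cutoff=1.0):
    """Indices of contigs whose mean F1 count falls outside the interval
    predicted from the midparent value (hypothetical helper)."""
    d = cooks_distance(midparent, f1)
    keep = [i for i, di in enumerate(d) if di <= cook_cutoff]
    x = [midparent[i] for i in keep]
    y = [f1[i] for i in keep]
    slope, intercept, _, mse, mx, sxx = ols(x, y)  # refit without influential points
    z = NormalDist().inv_cdf(0.5 + level / 2)      # normal approximation to t
    n = len(x)
    flagged = []
    for i, (xi, yi) in enumerate(zip(midparent, f1)):
        half_width = z * (mse * (1 + 1 / n + (xi - mx) ** 2 / sxx)) ** 0.5
        if abs(yi - (slope * xi + intercept)) > half_width:
            flagged.append(i)
    return flagged
```

Here `midparent` would hold the per-contig mean of the two parental means and `f1` the per-contig mean across hybrid samples.
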
Variance among hybrids 

Variability in transcript levels among individual F1 hybrids 
was assessed by calculating the coefficient of variation 
(CV) for each reference contig. To reduce bias in the 
estimates of CV due to non-normal distribution of transcript 
level estimates, read counts were first subjected to a 
natural log transformation and CV was calculated using the 
formula $\mathrm{CV} = \sqrt{e^{\sigma_{\ln}^{2}} - 1}$, where $\sigma_{\ln}$ is the sample 
standard deviation calculated from the log-transformed 
hybrid transcript values. Contigs with a CV greater than 2 
were considered to have high variance among hybrid 
plants. 
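
As an illustration of the variance measure, the sketch below computes a CV from log-transformed counts using the standard log-normal estimator, sqrt(exp(s^2) - 1), where s is the sample standard deviation of the natural-log values. The function name is hypothetical. The handling of zero counts is an assumption, since the text does not say how they were treated; here they are rejected.

```python
import math
from statistics import stdev


def lognormal_cv(counts):
    """CV estimated from natural-log-transformed counts (needs >= 2 positive values)."""
    if any(c <= 0 for c in counts):
        # Assumption: zero counts are not handled here; the log is undefined for them.
        raise ValueError("counts must be positive")
    s = stdev(math.log(c) for c in counts)  # sample SD on the log scale
    return math.sqrt(math.exp(s ** 2) - 1)


def is_high_variance(counts, cutoff=2.0):
    """Apply the CV > 2 rule used to label 'high variance' contigs."""
    return lognormal_cv(counts) > cutoff
```
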

Allelic bias in transcript accumulation 

Data from the four sequenced parental accession samples 
(HA89.5, HA89.9, PET.2, PET.3) were analyzed simultaneously 
using SAMtools 'mpileup' to identify single 
nucleotide polymorphisms (SNPs) with respect to the 
reference sequence. We used a custom Perl script to extract 
loci meeting the following criteria: 1) the variant 
allele frequency ≠ 1 (this criterion excludes sites where 
samples differ from the reference, but not between 
accessions), 2) phred quality score > 80, 3) a single allele 
is detected within each accession, and 4) the number 
of sequence reads covering the position is > 5 for each 
sample (Additional file 3: Supplemental Methods 2). This 
final criterion eliminates potential false discovery of allelic 
bias due to failure of one parental allele to align to the 
reference transcript set, yet also eliminates sequences that 
are not transcribed (at a detectable level under our conditions 
of growth and sampling) in one parent genome and that 
may show true allelic bias in the F1 offspring. 
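
The four site criteria can be sketched as a single filter function. This is not the authors' Perl script: the dictionary layout (`alt_freq`, `qual`, `alleles`, `depth`) is a hypothetical representation of one candidate site, and criterion 1 is read as "variant allele frequency not equal to 1".

```python
# Sample names follow the text; the grouping into accessions is explicit here.
ACCESSIONS = {
    "cmsHA89": ("HA89.5", "HA89.9"),
    "Pet2152": ("PET.2", "PET.3"),
}


def passes_filters(site, min_qual=80, min_depth=5):
    """Return True if a candidate SNP site meets all four criteria."""
    # 1) variant allele frequency across parental samples is not 1
    if site["alt_freq"] >= 1.0:
        return False
    # 2) phred-scaled quality strictly greater than the threshold
    if site["qual"] <= min_qual:
        return False
    # 3) exactly one allele observed within each accession
    for samples in ACCESSIONS.values():
        observed = set()
        for sample in samples:
            observed |= site["alleles"][sample]
        if len(observed) != 1:
            return False
    # 4) read depth strictly greater than the threshold in every sample
    for samples in ACCESSIONS.values():
        for sample in samples:
            if site["depth"][sample] <= min_depth:
                return False
    return True
```
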

For each qualifying variant position within a contig, 
read depth per SNP was determined for both H. annuus 
and H. petiolaris-derived variants within individual F1 
transcript sequence datasets. From SAMtools 'mpileup' 
output for individual hybrid plants, we extracted 'dp4' 
(read depth for: reference allele on the forward strand, 
reference allele on the reverse strand, alternate allele on 
the forward strand, alternate allele on the reverse strand) 
at each target site, and combined forward and reverse 
read counts to determine per-allele read depth. At this 
stage we also removed variants that were detected 
on only one direction of sequence read (either forward or 
reverse), as these are likely to represent sequencing 
artifacts. Read counts for each variant were compared 
across F1 samples using DESeq [36]. For later gene-level 
analyses, significant SNPs within the same contig were 
considered as a single significantly-differing transcript, 
with allelic bias estimated as an average of differences in 
read counts per SNP, and positions showing inconsistent 
results (i.e. one position shows significant bias toward 
the H. annuus variant, while the other shows bias toward 
the H. petiolaris variant) were flagged. To assess the general 
level of transcript variation due to cis regulatory 
divergence between parental genomes versus trans-acting 
regulators, we examined the overlap between reference 
contigs with one or more fixed SNPs showing significant 
differences in transcript accumulation between parental 
accessions and those showing significant allelic bias in F1 
hybrids. We also fitted a linear model to predict the 
magnitude of allelic bias based on the observed difference in 
transcript level between parental accessions. 
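
The DP4 step described above can be sketched as follows. The function name is hypothetical, the input is assumed to be the comma-separated DP4 value for one site in one hybrid sample, and the strand filter is read as "drop sites where the alternate allele appears on only one strand"; the text does not spell out this detail.

```python
def allele_depths(dp4):
    """Convert a DP4 value 'ref_fwd,ref_rev,alt_fwd,alt_rev' into
    (reference depth, alternate depth), or None if the site is dropped."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(v) for v in dp4.split(","))
    # Assumed filter: a variant supported by reads in one direction only
    # is treated as a likely sequencing artifact.
    if alt_fwd == 0 or alt_rev == 0:
        return None
    # Combine forward and reverse counts into per-allele read depth.
    return ref_fwd + ref_rev, alt_fwd + alt_rev
```

Which of the two alleles derives from H. annuus and which from H. petiolaris is decided separately, from the fixed differences identified in the parental samples.
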

Classification and annotation of transcripts 

Contigs with transcript accumulation patterns suggesting 
non-additive interactions between parental genomes 
within hybrid individuals, as revealed by the analyses 
described above, were labeled as 'non-additive', 
'transgressive', 'high variance', or 'allelic bias'. We identified 
non-additive transcripts as those showing significant 
deviation of mean transcript levels in hybrid plants from 
combined mean transcript accumulation of parental 
accessions (i.e. those hybrid transcript values falling 
outside the 99% confidence interval of the linear model 
associating hybrid transcript with mean parent transcript 
levels). Transcripts labeled 'transgressive' showed mean 
accumulation within the F1 hybrids that was significantly 
greater or less than the mean values observed for 
both H. annuus and H. petiolaris. While transgressive 
levels of transcript accumulation should also be described 
as non-additive, these two categories do not fully 
overlap due to the differences in analyses used to define 
them. Situations where the difference between 
H. annuus and H. petiolaris is large or there is variation 
in transcript abundance within parental accessions may 
broaden the confidence interval encompassing 'additive' 
values for F1s, despite F1 means significantly differing 
from both parents. 'High variance' contigs showed estimates 
of the coefficient of variation across F1s that were 
greater than 2. The set of reference contigs labeled 
'allelic bias' contained at least one SNP that distinguished 
the two parental alleles (see criteria above) with variants 
represented in mapped cDNA sequence reads at a ratio 
significantly different from equality. 

Potential functions of reference transcript contigs identified 
as non-additive in F1s according to any of the above 
criteria were explored via analysis of similarity to published 
protein (NCBI non-redundant protein RefSeq 
[37]) and nucleotide databases using BLASTX and 
BLASTN from NCBI-BLAST+ (http://www.ncbi.nlm.nih.gov/books/NBK1763/), 
excluding results with e-values 
greater than 1e-10. Analyses of gene ontology (GO) for 
contigs of interest were performed using GOrilla, with 
refinement via ReviGO [38,39]. For each gene list, contigs 
showing significant similarity to Arabidopsis thaliana 
TAIR10 sequences with GO annotations were compared 
to three separate similarly-sized lists of contigs 
randomly drawn from the H. annuus reference transcript 
set using a hypergeometric distribution (modified 
Fisher's Exact Test). GO processes found to be 
significantly (FDR-adjusted p-value < 0.05) over-represented 
in all three analyses are discussed. 
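
The hypergeometric test underlying this kind of over-representation analysis can be written in a few lines. This sketch shows only the upper-tail probability for a single GO term. It is not GOrilla's implementation, the function name is hypothetical, and FDR adjustment across terms is not included.

```python
from math import comb


def hypergeom_upper_tail(k, annotated, drawn, total):
    """P(X >= k) when `drawn` contigs are sampled without replacement from
    `total` contigs, of which `annotated` carry the GO term of interest."""
    upper = min(annotated, drawn)
    numerator = sum(
        comb(annotated, i) * comb(total - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(total, drawn)
```

For example, with 10 contigs in total, 4 annotated with a term, and a list of 3 contigs of which 2 carry the term, the probability of seeing 2 or more by chance is (36 + 4) / 120 = 1/3.
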

Results 

F1 seeds derived from fertilization of H. annuus cmsHA89 
with H. petiolaris Pet2152 pollen germinated with 88% 
success. F1 plants exhibited intermediate phenotypes with 
respect to parental accessions for all quantitative traits 
measured, except days to flowering, where F1 plants 
flowered, on average, earlier than plants of either parental 
accession (Table 1). F1 plants showed 1:1 segregation in 
production of pollen, suggesting that the Pet2152 pollen 
parent was heterozygous for a nuclear fertility-restoring 
locus complementary to the cytoplasmic male sterility 
present in cmsHA89. 

Table 1 External phenotypes of H. annuus cmsHA89, H. petiolaris Pet2152, and F1 hybrid offspring 

Accession              Pollen  Days to flowering (a)  Number of flowers (b)  Flower diameter, mm (c)  Trichome density (d)  Branching (e)
H. annuus cmsHA89      N       67.3 (2.0)             1.2 (0.7)              43.3 (12.3)              437.4 (180.5)         0 (0)
H. petiolaris Pet2152  Y       62 (6.2)               48.1 (7.3)             15.8 (2.5)               0 (0)                 8.7 (1.2)
F1 (mean)              Y/N     54.4 (2.0)             15.5 (7.2)             29.3 (4.6)               48.3 (40.9)           5.9 (1.7)
F1.01                  N       52                     11                     32.3                     31.5                  7
F1.03                  N       52                     22                     30.2                     59.3                  8
F1.04                  N       55                     5                      29.2                     25.9                  5
F1.07                  Y       55                     14                     28.5                     53.7                  7
F1.14                  N       55                     11                     26.9                     36.1                  6
F1.17                  Y       53                     16                     25.8                     25.9                  7
F1.TA                  Y       54                     17                     21.6                     51.9                  6
F1.TB                  N       55                     17                     24.8                     11.1                  4

Data provided are mean values (standard deviation). Phenotypes for individual F1 plants sampled for transcript sequencing are provided below the mean values 
for the total (n = 52) F1 plants grown. (a) Number of days post-germination when the first flower opened; (b) number of flowers produced by the plant; (c) disk diameter 
for the first flower (does not include ray flowers); (d) foliar glandular trichomes per cm² leaf abaxial surface; (e) total number of lateral branches > 2 mm in thickness. 

RNA extraction and Illumina shotgun sequencing of 
cDNA were performed for eight F1 plants as well as two 
plants from each parental accession, generating an 
average of approximately 27 million 100 bp paired-end reads 
per sample. Linear modeling of sequence output showed 
no significant difference among accessions in the number 
of reads generated per sample (F = 0.0728, p = 0.93, 
R² for the model = 0.0159). However, a significantly smaller 
percentage of Pet2152 reads were successfully mapped 
to the H. annuus-derived reference transcript dataset 
when compared to HA89: 51.94 (± 0.90) vs. 58.26 (± 0.37) 
percent mapped, F = 4.826, p(model) = 0.037, 
p(HA89-Pet2152 ≠ 0) = 0.013, R² for the model = 0.5175. Sequence 
reads obtained from F1 hybrid plants mapped to the 
reference with intermediate success: 55.05 (± 2.27) percent 
mapped, p(HA89-F1 ≠ 0) = 0.077. Of 16312 contigs contained 
in the reference transcript set, approximately 2.5% 
had no reads mapped from the combined 12 samples and 
7.5% (1220 contigs) had a per-sample average depth of less 
than ten sequence reads. 

Examining the relationships among samples for transcript 
accumulation levels over the entire transcript reference 
via both Spearman correlation and principal 
components analyses showed that two samples, HA89.9 
and F1.TA, grouped together rather than with other 
HA89- or F1-derived samples. The average coefficient of 
the pairwise correlation between F1.TA and other F1 
samples was 0.818, while the range of correlation coefficients 
(R²) for comparisons among F1 samples (excluding F1.TA) 
was 0.977-0.998. Similarly, while cmsHA89 samples were 
significantly and positively correlated (R² = 0.755), transcript 
levels showed higher similarity between HA89.9 
and F1.TA (R² = 0.977). As patterns of sequence polymorphism, 
in addition to earlier genotyping, confirmed 
that these samples were identified correctly, we hypothesize 
that uncontrolled environmental factors influenced 
transcript accumulation patterns in these two plants, 
despite our attempts to maintain similar conditions. In 
particular, when compared to all other samples these 
plants show relatively reduced accumulation of transcripts 
involved in photosynthetic processes, and relative 
increases in accumulation of transcripts associated with 
defense and stress responses. To avoid excessive influence 
of these samples on interpretation of our data, we 
conducted analyses of both the complete data (n = 12) 
and a dataset from which HA89.9 and F1.TA had been 
removed (n = 10). Results presented below show the 
overlap between these two analyses; differences between 
analyses of complete and reduced datasets are shown in 
Additional file 4: Figure S1. 

Interspecific differences in transcript accumulation 

Differential transcript accumulation was assessed via pairwise 
comparison of transcriptome shotgun sequence 
from H. annuus cmsHA89, H. petiolaris Pet2152 and 
cmsHA89 x Pet2152 F1 hybrids. Per-contig transcript 
accumulation, measured as mapped sequence reads per 
contig, differed between accessions for 1456 (cmsHA89 
vs. Pet2152), 125 (cmsHA89 vs. F1), and 1555 (Pet2152 
vs. F1) transcripts, using a q-value (adjusted for multiple 
comparisons) threshold of 0.01 (Figure 2; Additional file 5: 
Table S1). High variance between the cmsHA89 samples 
(discussed above) likely contributes to the lower number 
of significantly-differing transcripts detected in comparisons 
involving this accession. A greater number of 
contigs (64.7%) showed elevated transcript accumulation 
in cmsHA89 (943 contigs) versus Pet2152 (513 contigs). 
These included 169 contigs with no reads mapped from 
one accession, with 123 of these containing no mapped 
transcript reads from Pet2152. A similar bias was observed 
in comparisons of cmsHA89 to F1 hybrids, as 80 of 
125 contigs significantly differing in transcript accumulation 
(64%) showed elevated counts in HA89 samples. 
However, comparison of F1s with Pet2152 showed greater 
similarity in the numbers of transcripts elevated for each 
accession, with approximately 54% (835 contigs) showing 
higher transcript accumulation in H. petiolaris samples 
relative to F1 samples. 

Figure 2 Heatmap of z-score normalized means of mapped 
read counts for 1488 genes significantly differing in transcript 
accumulation (q-value < 0.01) for at least one pairwise 
comparison among H. annuus cmsHA89 (A), H. petiolaris 
Pet2152 (P), and cmsHA89 x Pet2152 F1 interspecific hybrids 
(F1). Only transcripts with significant differences conserved between 
full and reduced analyses are shown. Accession groups are 
shown as columns. Individual transcripts are arrayed in rows; a 
list of reference identifiers, mean read counts, and annotation by 
similarity to published sequences is provided as Additional file 5: 
Table S1. Shading indicates lower (lighter) or higher (darker) 
relative transcript values. 

Non-additive transcript accumulation in F1 hybrids 

F1 plants showed intermediate levels of transcript accumulation 
for more than 99% of these comparisons 
(Figure 3). A linear model of mean transcript accumulation 
across F1 plants as a function of the mean of parental 
samples explained a high proportion of F1 transcript 
variance (p < 0.0001, R² = 0.98 (reduced data); p < 0.0001, 
R² = 0.96 (full data)). Slope and intercept were estimated 
as 1.09 (±0.0011) and 2.278 (±7.384), respectively (reduced dataset). 
Modeling transcript accumulation for individual 
F1 plants against midparent values generated 
model R² values ranging from 0.908 to 0.947, with the 
exception of the outlier sample F1.TA, where midparent 
values explained only one third of transcript level variance. 
Only 159 contigs had mean hybrid read counts 
outside the 99% confidence intervals generated for 
predicted hybrid transcript values in both analyses 
(Additional file 5: Table S1). The mean transcript accumulation 
estimates for these contigs in F1 plants 
ranged from 1.2 to 2372% of predicted values, roughly 
evenly divided between those above (44%) or below (56%) 
the parental mean. 

Figure 3 Combined mean transcript accumulation (count of 
mapped reads) for parental accessions H. annuus cmsHA89 and 
H. petiolaris Pet2152 plotted on the horizontal axis against 
mean transcript accumulation in F1 hybrid plants (vertical axis). 
Each point represents one of 16,312 contigs in the reference 
transcript set. Values on both axes are plotted on a log10 scale. 
Dotted lines indicate the 99% confidence interval for F1 hybrid 
transcript accumulation predicted by the linear model: F1 transcript 
accumulation = SLOPE*mean parental transcript accumulation + 
INTERCEPT (p < 0.0001, R² = 0.98). 

Analysis of gene ontologies (GO) assigned to these 
transcripts indicated significant overrepresentation of 
160 GO processes, reduced to 64 by collapsing highly 
redundant categories. Among these GO terms, two 
groups were prominent, involving photosynthesis and 
energy metabolism (including photosystem assembly, 
chlorophyll biosynthesis, plastid localization, pentose- 
phosphate shunt, electron transport) and defense res- 
ponse (including salicylic acid biosynthesis, regulation 
of hypersensitive response, MAPK cascades, jasmo- 
nate signaling). 
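
Overrepresentation of a GO term in a set of contigs is commonly assessed with a one-sided hypergeometric test. The sketch below assumes that test; the method actually applied in this study is not restated in this section, and the annotation counts used in the example are hypothetical. Only the reference set size (16,312) and the number of flagged contigs (159) are taken from the text.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, observed):
    """One-sided enrichment p-value, P(X >= observed).

    X is the number of annotated contigs obtained when `sample` contigs
    are drawn without replacement from `population` contigs, of which
    `annotated` carry the GO term of interest.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(observed, upper + 1)
    )
    return tail / total  # exact integer arithmetic until the final division


population = 16312   # contigs in the reference set (from the text)
sample = 159         # contigs outside the 99% interval (from the text)
annotated = 400      # hypothetical: contigs carrying a given GO term
observed = 20        # hypothetical: flagged contigs carrying that term

p_value = hypergeom_enrichment_p(population, annotated, sample, observed)
expected = sample * annotated / population
print(f"expected={expected:.1f} observed={observed} p={p_value:.2e}")
```

Because many GO terms are tested at once, p-values from such a test would need adjustment for multiple comparisons before being interpreted.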

Transgressive transcript accumulation in F1 hybrids

While the majority of contigs examined showed intermediate
levels of transcript accumulation in F1 plants relative to
parent accessions, 10 contigs consistently showed transcript
accumulation significantly greater or less than values
observed in cmsHA89 or Pet2152 samples (Table 2). Of these,
8 transcripts showed higher accumulation in hybrids than in
parental accessions, a bias that is maintained when the
criteria for identifying contigs as significantly
transgressive are relaxed to include contigs significant in
only one of the two analyses (full or reduced dataset):
52/65 identified contigs
Table 2 Contigs showing significantly transgressive transcript phenotypes across F1 H. annuus x H. petiolaris hybrids in
both full and reduced analyses
(Reduced analyses omit outlier samples HA89.9, F1.TA.) Mean transcript accumulation (RPKM) per accession (A = H. annuus cmsHA89,
H = F1 hybrid, P = H. petiolaris Pet2152) are provided for both full ("full") and reduced ("red") analyses; 'summary annotation'
summarizes the hypothesized gene function based on BLAST-identified similarity. Details of BLAST analyses are provided in
Additional file 6: Table S2.



showed higher transcript accumulation in F1s under
these relaxed criteria (Additional file 6: Table S2). In
addition, 50 reference contigs identified as showing
non-additive F1 transcript accumulation had F1 mean
values either higher (41) or lower (9) than both parent
mean values (Additional file 5: Table S1).
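
The direction of an F1 mean relative to the two parental means can be expressed as a simple comparison. The sketch below is illustrative only: the function name and the RPKM values are hypothetical, and it compares means without the significance testing that the transgressive classification in this study also requires.

```python
def classify_f1_mean(ann_mean, pet_mean, f1_mean):
    """Place an F1 mean relative to the range spanned by the parental means."""
    low, high = sorted((ann_mean, pet_mean))
    if f1_mean > high:
        return "above both parents"
    if f1_mean < low:
        return "below both parents"
    return "intermediate"


# Hypothetical mean RPKM values for three contigs:
# (H. annuus mean, H. petiolaris mean, F1 mean)
examples = {
    "contig_a": (10.0, 30.0, 45.0),
    "contig_b": (10.0, 30.0, 4.0),
    "contig_c": (10.0, 30.0, 21.0),
}
for name, (ann, pet, f1_value) in examples.items():
    print(name, classify_f1_mean(ann, pet, f1_value))
```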

Variance among F1 hybrids

The F1 plants examined in this study are the product of
hybridization between an inbred H. annuus domesticated
line and a wild-collected H. petiolaris accession that is
highly heterozygous. Both external and transcript-level
phenotypes evaluated in F1 plants were largely intermediate
with respect to the parental accessions, yet did show
variation among F1s (Table 1). Transcript sequences were
obtained from eight individual F1 plants, allowing us to
evaluate inter-plant variation in transcript accumulation
that may be attributable to interaction with segregating
regulatory loci in a parental genome. We calculated the
coefficient of variation (CV) for each transcript across
all F1 plants (Figure 4A). A total of 166 contigs
(approximately 1% of the reference transcriptome)
consistently had a CV greater than 2 for F1 samples
(Additional file 5: Table S1). These mainly included genes
controlling cell division and DNA synthesis. When these
contigs are subjected to hierarchical clustering, F1 samples
form one relatively uniform group resembling H. petiolaris
samples, and one more variable group including
H. annuus (Figure 4B, Additional file 7: Figure S2).
This is consistent with segregation of parental alleles
associated with regulation of these transcripts.
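
The coefficient of variation used above is the standard deviation divided by the mean. A minimal sketch follows, assuming the sample standard deviation and hypothetical per-plant values; the contig names and the helper functions are illustrative and are not taken from the analysis scripts.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    if m == 0:
        return 0.0
    return stdev(values) / m


def high_cv_contigs(values_by_contig, threshold=2.0):
    """Names of contigs whose CV across F1 plants exceeds the threshold."""
    return [
        name
        for name, values in values_by_contig.items()
        if coefficient_of_variation(values) > threshold
    ]


# Hypothetical normalized transcript levels in eight F1 plants.
f1_levels = {
    "contig_a": [100, 110, 95, 105, 98, 102, 107, 99],  # uniform across plants
    "contig_b": [0, 0, 0, 0, 0, 0, 0, 800],             # present in one plant
}
print(high_cv_contigs(f1_levels))
```

With the sample standard deviation and non-negative values, the CV of eight samples cannot exceed the square root of 8 (about 2.83), so a CV above 2 implies that accumulation is concentrated in very few plants.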

Allelic bias in transcript accumulation in F1 hybrids

A total of 13,734 fixed single nucleotide variants (SNPs)
were identified between H. annuus and H. petiolaris
transcript reads, contained within 3,393 contigs. These
SNPs were distributed across the genome, with a high
correlation between the density of SNPs detected and the
overall abundance of sequence reads mapped to a given genetic
position (Spearman correlation: R² = 0.93) (Figure 1). It
was therefore possible to distinguish between parent
genome contributions to the F1 hybrid transcript pool
for approximately 20% of the reference transcript set.
The average Spearman correlation coefficient of transcript
levels between alleles within individual F1 samples was
0.82 (±0.015). We identified 1,363 polymorphic sites within
681 contigs where allelic variants derived from the two
parental genomes were detected in significantly different
quantities, indicating allelic bias in transcript
accumulation. For the majority of transcripts, the magnitude
of the allelic bias is relatively small, with the dominant
allele present at approximately twice the level of the
alternate parental allele (Figure 5).
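
One simple way to test a single SNP for allelic bias is an exact binomial test of the allele-specific read counts against equal accumulation of the two alleles. The sketch below assumes that test for illustration; it is not necessarily the statistical procedure used in this study, and the read counts are hypothetical. The log2 ratio follows the sign convention of Figure 5, with positive values indicating more reads from the H. petiolaris allele.

```python
from math import comb, log2


def allelic_bias_p(ann_reads, pet_reads):
    """Exact two-sided binomial p-value against a 1:1 allelic ratio.

    Sums the probabilities of all outcomes that are no more likely than
    the observed split, using integer arithmetic to avoid overflow.
    """
    n = ann_reads + pet_reads
    observed = comb(n, ann_reads)
    extreme = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return extreme / 2 ** n


def allelic_log2_ratio(ann_reads, pet_reads, pseudocount=0.5):
    """log2(H. petiolaris / H. annuus); the pseudocount guards against zeros."""
    return log2((pet_reads + pseudocount) / (ann_reads + pseudocount))


# Hypothetical read counts at one SNP within one F1 plant.
ann_reads, pet_reads = 150, 75
p_value = allelic_bias_p(ann_reads, pet_reads)
ratio = allelic_log2_ratio(ann_reads, pet_reads)
print(f"log2 ratio={ratio:.2f} p={p_value:.2e}")
```

A twofold difference between alleles, the typical magnitude reported above, corresponds to a log2 ratio near 1 in absolute value. Testing thousands of SNPs would also require adjustment for multiple comparisons, which is not shown here.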

Of the 681 contigs containing at least one SNP
showing significant bias in F1s, only 81 were also
identified as showing significant differences in transcript
accumulation between parental accessions cmsHA89 and
Pet2152 (Figure 5). A conservative estimate based on
overlap of both full and reduced analyses indicates that
1,456 (or 9% of) reference transcripts examined differ in
transcript accumulation between parental accessions.
Differences between parental accessions were consistent
with differences between alleles in 65% of the contigs
showing significant differences in both sets of analyses.
All inconsistent contigs showed significantly higher
levels of accumulation in H. petiolaris (compared to
H. annuus) samples but significantly lower accumulation of
transcripts bearing H. petiolaris alleles in samples from
hybrids. Across all contigs showing significant evidence
of allelic bias, almost 77% (523/681) showed higher
transcript levels in H. petiolaris samples than H. annuus
samples, suggesting that the larger number of transcripts
observed showing bias toward the H. annuus allele in
hybrid samples is not simply explained by preferential
Figure 4 Variation in transcript accumulation among F1 hybrid individuals. A) Distribution of the coefficient of variation (CV) for 16,312
transcripts analyzed from F1 plants; grey = CV calculated from full data, light blue = CV calculated from reduced data (minus F1.TA), darker blue
indicates overlap. B) Heatmap showing z-score normalized transcript accumulation for 166 reference contigs with CV > 2 within F1 samples.
Samples from individual plants are shown in horizontal rows; F1 = hybrid F1, A = H. annuus cmsHA89, P = H. petiolaris Pet2152. Hierarchical
clustering estimated from Spearman correlation coefficients for pairwise contig (x-axis) and sample (y-axis) distance matrices. The colored bar
along the top edge indicates assignment of transcripts to GO Biological Process groups, with prominent categories: green (cell cycle/mitosis),
yellow (histones/chromatin modification), blue (metabolism), red (stress/defense), and pink (transcription factors/signaling).



alignment of transcript sequence reads to the H. annuus- 
based reference transcriptome. 

Discussion 

Gene expression changes associated with hybridization 
may be attributed to a variety of factors. Novel gene 
combinations, chromosomal rearrangements, increases 
in transposon activity, and changes in DNA methylation 
status occur in interspecific hybrids and are likely to 
affect gene expression [30,31,40-43]. The transcriptional
phenotypes of first generation hybrids should
predominantly reflect the basic interaction of parental
genomes and their endogenous regulatory factors. Deviation
of F1 transcript accumulation from midparent values
(expected if parental genome contributions to hybrid
transcript accumulation were purely additive) will reflect
epistatic and dominance interactions between parental genomes.

The transcript patterns observed in annual sunflower 
hybrids in this study differ from other systems used 
to study homoploid hybridization in experimental set- 
tings, such as Drosophila interspecies hybrids and re- 
synthesized Senecio squalidus, where relatively high 
proportions of transcripts examined showed "misexpres- 
sion" or allelic bias [14,15,44]. Various lines of evidence 
suggest that H. annuus and H. petiolaris have experienced 
substantial levels of recent genetic exchange, in several 
instances resulting in ecologically mediated formation of 
hybrid species [22,25,45-48]. While reduced divergence 
through introgression might be expected to increase ge- 
nomic compatibility, selection for hybrid viability should 
also select against extreme levels of genomic misregula- 
tion. In this study, we have selected not merely for strict 



viability, but for growth beyond the seedling stage. It 
remains possible that regulatory incompatibilities have 
greater impact on early stages of growth and development, 
or specifically in reproductive tissues, and thus are not 
detected in this study, which, as is generally true for ana- 
lyses of transcript accumulation, can only provide a snap- 
shot of the continuous flow of transcript production and 
degradation. In this experiment, we also observed strong,
uncontrolled environmental effects on transcript profiles
that led to a loss of experimental power, most prominently
affecting our ability to confidently identify transcriptional
differences between H. annuus cmsHA89 and H. petiolaris
Pet2152 or F1 hybrids. Comparisons between
H. petiolaris and F1, or within F1, are relatively unaffected.
While this means that we may underestimate
transcriptional divergence of F1 from the maternal parent, a
broader implication is that uncontrolled environmental
factors can have dramatic effects on transcription. The
distribution of random effects within the generally
resource-limited designs of many transcriptional
profiling experiments may have profound effects on the
conclusions drawn from these experiments, which would
be exacerbated by genotype by environment interaction.

It is believed that formation of Helianthus hybrid spe- 
cies has been mediated by environmental selection on 
transgressive phenotypes generated through segregation 
of parental genomes [25,26,28]. At the same time, in- 
teracting parental genomes present in early generation 
hybrids must generate phenotypes with sufficient fitness 
to survive beyond the initial hybrid generation for novel 
segregants to appear. Naturally occurring hybrid
individuals, as well as laboratory-derived first-generation
Figure 5 Magnitude of significant transcript accumulation differences observed between parental accessions H. annuus cmsHA89 and
H. petiolaris Pet2152 (A, C) and between parental alleles within F1 hybrids (B, D). Horizontal axes show the log2 fold-change in transcript
accumulation associated with a shift from H. annuus to H. petiolaris, thus positive values indicate relatively higher levels of transcript accumulation
in H. petiolaris (A, C) or of the H. petiolaris allele within the F1 (B, D). Vertical axes show the number of contigs showing statistically significant
differences between accessions or alleles (adjusted p-value < 0.01). Panels A and B show the distribution of all significant results, while panels
C and D show only contigs from the reference dataset that show significant differences in transcript levels both between accessions and
between alleles.



hybrids, appear to exhibit intermediate phenotypes for
many morphological and phenological traits (Table 1)
[19-21,49]. This study suggests that H. annuus x
H. petiolaris F1 hybrids also exhibit quantitatively
intermediate phenotypes at the level of transcript accumulation,
reflecting widespread compatibility between diverged
parental transcript regulatory networks. The small sample
sizes for parental accessions in this study may have
hindered detection of transgressive transcription in F1
hybrids, through increased uncertainty regarding actual
parental transcript levels. Our approach still provides an
improvement in estimating parental transcription over
strategies employing pooled samples, and focusing
sampling effort on individual F1s has provided more reliable
estimates of both the mean and variance of transcript
levels in hybrids.

Although Senecio aethnensis and S. chrysanthemifolius
(the parents of the homoploid hybrid S. squalidus) form a
well-established hybrid zone with evidence of substantial
gene flow between species, a much larger percentage of the
analyzed transcripts of first generation hybrids showed
evidence of non-additive (4.9%) or transgressive (3.2%)
accumulation [14,50,51]. The relative scarcity of non-additive
(0.97-1.28%) or transgressive (0.06-0.7%) transcriptional
phenotypes in this study might be attributed to differences
in methodology. Quantification of transcript levels via
sequencing techniques, rather than hybridization-based
microarray platforms, allows both examination of a broader
array of transcripts and greater sensitivity to detect
low-level transcripts (without necessarily increasing statistical
power to detect differences in these transcripts). Analysis of
individual F1 plants allowed assessment of transcript
variance among hybrids; distributions of hybrid transcript
levels may suggest different relationships to parental
transcript levels than values generated from pooled hybrid
samples. Differences in the historic patterns of hybridization
and selection, or in the phylogenetic distance between
hybridizing species (much greater for H. annuus and
H. petiolaris than between the two sister Senecio species in
question), might also account for the different outcomes
observed in these two hybrid systems. In particular, the
female parental lineage of the hybrids examined in this study
is a product of modern breeding, which has included
hybridization with wild sunflowers [52].

Less than 1% of analyzed hybrid transcript levels fell
outside the predictive 99% confidence interval based on
averaged transcript levels from parental accessions (Figure 3).
Thus, non-additive transcript levels in these F1 plants
are detected at a frequency indistinguishable from that
expected by chance. These transcripts do, however, show
significant over-representation of transcripts predicted to 
function in the broad categories photosynthesis/energy 
metabolism and response to biotic stimulus. In particular, 
transcripts participating in photosynthetic and energy pro- 
cesses are likely to be influenced by interaction with cyto- 
plasmic components, even if the genes themselves are 
transmitted through nuclear inheritance. These transcripts 
also accumulate to high levels, potentially increasing the 
relative statistical power to identify variance from expected 
transcript values. The GO terms associated with the 
group of non-additively accumulated transcripts puta- 
tively involved in responses to biotic stimuli include 
defense response to bacterium, salicylic acid biosyn- 
thesis and metabolism, systemic acquired resistance, 
and MAPK cascade signaling. Misregulation or allelic 
incompatibility of genes involved in plant immune re- 
sponses, particularly related to specific recognition of 
biotrophic pathogens, has been implicated in hybrid ne- 
croses (an extreme example of hybrid genome incom- 
patibility) in Arabidopsis thaliana, lettuce, and wheat 
[53-55]. The hybrid plants in this study showed no obvious 
sign of hybrid necroses under relatively benign growth 
conditions, and rigorous examination of the phenotypic 
consequences of altered transcript levels for these 
immunity-associated genes will be necessary to determine 
whether immune incompatibilities are likely to have signifi- 
cant evolutionary consequences for Helianthus hybrids. 

Interspecific hybridization presents the opportunity to
distinguish between the effects of nucleotide sequence
variation associated with the transcript site (cis variation)
and those of polymorphism in trans-acting regulatory factors.
Variation in transcript accumulation between parental
accessions that is caused by polymorphism in trans-acting
factors should be diminished in hybrid individuals where
transcription factors from both genomes are present.
The allelic bias detected in F1 hybrids suggests that
many differences observed between parental accessions
are attributable to cis variation, although the magnitude
of allelic bias is generally smaller than the difference in
transcript levels observed between parental accessions.
The observed expression patterns might therefore be a
product of regulatory interaction within or between loci.
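
The reasoning above is often formalized by treating the log2 ratio between parental accessions as the total regulatory divergence and the log2 ratio between alleles within the F1 as its cis component, so that the remainder is attributed to trans-acting factors. The sketch below applies that common decomposition as an assumption; no such calculation is reported in this study, and the expression values are hypothetical.

```python
from math import log2


def cis_trans_components(parent_ann, parent_pet, allele_ann, allele_pet):
    """Split parental divergence into cis and trans components (log2 scale).

    total = log2(parent_pet / parent_ann)   difference between accessions
    cis   = log2(allele_pet / allele_ann)   allelic ratio within the F1
    trans = total - cis                     remainder, attributed to trans
    All four inputs must be positive.
    """
    total = log2(parent_pet / parent_ann)
    cis = log2(allele_pet / allele_ann)
    return total, cis, total - cis


# Hypothetical contig: fourfold higher in H. petiolaris than in H. annuus,
# but only a twofold excess of the H. petiolaris allele within the F1.
total, cis, trans = cis_trans_components(
    parent_ann=100.0, parent_pet=400.0, allele_ann=100.0, allele_pet=200.0
)
print(f"total={total:.1f} cis={cis:.1f} trans={trans:.1f}")
```

Under this framing, an allelic bias smaller than the parental difference, as observed for most contigs here, leaves a non-zero trans component.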

Analyses of gene ontology indicated that the group of 
transcripts showing significant allelic bias is enriched for 
processes including chloroplast organization, energy me- 
tabolism, translation, rRNA processing, and biosynthe- 
sis of isopentenyl diphosphate via the non-mevalonate 
(plastid-based) pathway. As these processes all involve 
cytoplasmically-inherited cellular components, it is plau- 
sible that nuclear-cytoplasmic interactions drive the 
allelic biases in transcript accumulation observed in hy- 
brids. Despite H. annuus serving as the maternal parent 
of the hybrids, all over-represented gene groups exa- 
mined contained a mixture of transcripts showing over- 
representation of H. annuus or H. petiolaris alleles. 

The extent of cis regulatory differences between
H. annuus and H. petiolaris transcripts is likely
underestimated in the approach presented here. The criteria
for selection of variants used to assign parentage to
transcripts within F1 individuals exclude both loci lacking
mapped transcript reads from either parental accession
and loci that are polymorphic within either parental
accession. While, on average, approximately 130,000
high-confidence heterozygous sites were identified per F1
individual, parentage could only be reliably assigned for
a fraction of these. In addition, transcripts affected by
polymorphism in cis regulatory sequences, but lacking 
consistent sequence polymorphism between parental ac- 
cessions within the actual transcripts, will not be detec- 
ted as showing allelic bias, although transcripts from 
such loci may be preferentially derived from one paren- 
tal genome [16,56,57]. A relative lack of fixed poly- 
morphism in transcripts showing expression differences 
between parental accessions contributes to the minimal 
overlap observed between transcripts showing allelic bias 
and those showing differences in accumulation between 
parental accessions. This suggests that cis and trans 
regulation are both influential in first generation hybrids, 
but confidently apportioning their relative effects will re- 
quire additional data from non-coding regulatory regions. 

Conclusion 

Studies typically focus on the extreme consequences of 
hybridization, both the good (heterosis) and the 
bad (genomic incompatibilities and hybrid necroses) 
[53-55,58-61]. This study, in contrast, detects few ex- 
treme transcript phenotypes in hybrid offspring of two 
annual sunflower species that show evidence of exten- 
sive gene flow since their divergence. Comparison of 
additional hybrid transcriptomes from crosses of wild 
sympatric and allopatric H. annuus and H. petiolaris, 
particularly incorporating a range of tissues and devel- 
opmental stages, may clarify the role that introgression 
plays in transcriptional compatibility. 
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NCBI Sequence Read Archive under accession num- 
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Additional file 1: Supplemental Methods 1. Preliminary confirmation 
of hybrid identity of F1 plants via PCR based genetic markers. 

Additional file 2: Table S3. Genetic positions of reference transcripts 
showing identity to sequenced markers appearing on a map of 
H. annuus derived from recombinant inbred lines from the population 
RHA280 x RHA801 (unpublished). 

Additional file 3: Supplemental Methods 2. Perl script used to extract 
informative single nucleotide variants for analysis of allelic bias in hybrid 
transcript accumulation. 

Additional file 4: Figure S1. Venn diagrams showing overlap
between full (all data) and reduced (minus outlier samples HA89.9
and F1.TA) analyses.

Additional file 5: Table S1. All reference transcripts showing at least
one significant difference in analyses of species differences between
H. annuus cmsHA89 and H. petiolaris Pet2152 or analyses of
transgressive, non-additive, high variance, or allele-biased transcripts in
F1 hybrids. 'REFERENCE' identifies the reference contig. 'LENGTH' gives
the length of the reference contig in bases. Black or grey symbols
within the following columns indicate whether the specified difference
was statistically significant in both full and reduced analyses (black/
bold) or only a single analysis (grey). For "TRANSGRESSIVE" and
"NON-ADDITIVE" transcripts, '▲' indicates that F1 samples showed a
mean transcript accumulation greater than observed for either parental
accession; '▼' indicates lower levels of transcript in F1 samples. For
"NON-ADDITIVE" transcripts, '•' indicates that F1 transcript
accumulation was intermediate relative to parental accessions. For
"ALLELIC BIAS" and "SPECIES DIFFERENCE", 'A' and 'P' indicate that
higher transcript accumulation was observed for the H. annuus or
H. petiolaris allele/accession, respectively. For "HIGH CV", '•' indicates a
contig showing a coefficient of variation among F1 samples that is > 2.
"TAIR10" provides the best nucleotide BLAST hit to the TAIR10 genome
assembly (www.arabidopsis.org); "no hit" indicates no results with
e-value < e-10. "UNIPROT" provides the UniProt ID for the best BLASTX hit
against the UniProt Knowledgebase, release 2012_08. "description"
provides an abbreviated annotation of gene function.

Additional file 6: Table S2. Transcripts showing transgressive levels of 
accumulation in F1 hybrids in either full or reduced analyses. Mean RPKM 
per accession (A = H. annuus cmsHA89, H = F1 hybrid, P = H. petiolaris 
Pet2152) are provided for both full ("full") and reduced ("red") analyses. 'SIG' 
indicates whether a given transcript shows significant transgression in 
'FULL', 'REDUCED', or 'BOTH' analyses, with 'FULL(TA)' indicating transcripts 
that were transgressive in full analyses due to inflation of the F1 mean
transcript estimates by the sample F1.TA. 'Transgressive' indicates whether
F1 transcript levels were determined to be high or low compared to
parental accessions. 'Summary_Annotation' summarizes the hypothesized
gene function based on BLAST-identified similarity. Annotation of the best 
protein BLAST hit is provided, along with the GenBank identifier and e-value 
for the BLAST hit, in subsequent columns. 

Additional file 7: Figure S2. Addendum to Figure 4B, showing full data
(outlier samples HA89.9 and F1.TA were not included in the main
manuscript figure). This heatmap shows z-score normalized transcript
accumulation for 166 reference contigs with CV > 2 within F1 samples.
Samples from individual plants are shown in horizontal rows. Hierarchical
clustering estimated from Spearman correlation coefficients for pairwise
contig (x-axis) and sample (y-axis) distance matrices. The colored bar along
the top edge indicates assignment of transcripts to GO Biological Process 
groups, with prominent categories: green (cell cycle/mitosis), yellow 
(histones/chromatin modification), blue (metabolism), red (stress/defense), 
and pink (transcription factors/signaling). 
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